SUMMARY

DNA sequencing of genomic cDNA clones of avian infectious bronchitis virus (IBV) has been carried out. 
770 bases have been determined which include genomic sequences spanning the 5' termini of the two smallest 
mRNAs of the 3'-coterminal “nested” set: mRNA A and mRNA B. This region contains the complete coding 
sequences for mRNA B which are additional to those present in mRNA A. Two open reading frames are 
present, predicting proteins of M r s 7500 and 9500. 


INTRODUCTION 

Avian IBV, in common with other coronaviruses, 
has a single-stranded, polyadenylated, infectious 
RNA genome approx. 20 kb in length (Stern and
Kennedy, 1980a). In infected cells multiple sub- 
genomic positive-stranded RNAs are produced 
(Siddell et al., 1983). For IBV and MHV these have 
been shown to consist of a 3'-coterminal “nested” 
set (Stern and Kennedy, 1980a; Lai et al., 1981; 
Leibowitz et al., 1981). For both IBV and MHV the 
messenger function of these subgenomic RNAs has 
been demonstrated (Rottier et al., 1981; Stern et al.,
1982; Siddell, 1983). However, certain differences of 
genome organisation have become apparent between 


Abbreviations: bp, base pairs; IBV, infectious bronchitis virus; 
kb, kilobases or kilobase pairs; MHV, murine hepatitis virus; 
mRNA, messenger RNA; ORF, open reading frame. 


these two coronaviruses. First, MHV has six major 
subgenomic RNAs whereas IBV has only five (Stern 
and Kennedy, 1980b; Lai et al., 1981). Second, the 
coding function of the various messenger RNAs is 
different. Both IBV and MHV contain three main 
structural polypeptides: the nucleocapsid, the 
membrane (E1), and the spike or peplomer (E2)
polypeptides (Cavanagh, 1981; Siddell et al., 1983). 
In both systems the smallest RNA codes for the 
nucleocapsid protein but in MHV the next smallest 
RNA codes for the membrane polypeptide, whereas 
in IBV the membrane polypeptide is coded for by the 
third smallest RNA (Siddell et al., 1980; Siddell, 
1983; Rottier et al., 1981; Stern et al., 1982; Stern 
and Sefton, 1984). These differences are summarised 
in Fig. 1. 

The organisation of the messenger RNAs and in 
vitro translation studies have led to the hypothesis 
that the 5'-most sequences of each mRNA, which 
are not present in the next smallest mRNA, contain 

Fig. 1. Sequence organisation of MHV and IBV mRNAs. For
MHV, mRNA 1 is the same length as the genomic RNA. For 
IBV, mRNA F is the same length as the genomic RNA. For IBV 
the approximate sizes of RNAs A, B, C, D and E are 2, 2.4, 3.4, 
4.1 and 7.8 kb (see MATERIALS AND METHODS, section c). 
The coding assignments are shown at the left-hand end of each 
mRNA. N, nucleocapsid; M, membrane or E1 polypeptide; S,
spike or peplomer polypeptide. The heavy bar shows the region 
of sequence presented in this paper. Also shown is the position 
of clone C5.136. 

the complete coding sequences for the major protein 
product produced by that messenger species (Stern 
and Kennedy, 1980b; Lai et al., 1981). However, 
since the only coronavirus sequences published are 
those of the smallest RNA species (Armstrong et al., 
1983; Skinner and Siddell, 1983) it has not been 
possible to examine this hypothesis at the RNA 
sequence level. RNA sequence data from this region 
might also enable us to predict the properties and 
thus aid the identification of possible polypeptides 
coded for by mRNA B. 

In this paper we report the nucleotide sequence of 
a cloned cDNA copy of IBV genomic RNA in the 
region corresponding to the 5' end of mRNA B. The 
sequence shows that the 5'-most sequences of 
mRNA B could code for a hydrophobic 7.5-kDal 
protein. 


MATERIALS AND METHODS 

(a) Cloning of IBV genomic RNA 

The preparation of cDNA clones has been previously
described (Brown and Boursnell, 1984).
Briefly, virion RNA was isolated from IBV strain 
Beaudette grown in embryonated eggs. cDNA was 
produced by oligo(dT)-primed reverse transcription 


of the RNA, followed by self-primed reverse transcription
to generate the second strand. S1 nuclease-treated
cDNA was dC-tailed using terminal transferase,
annealed to dG-tailed PstI-cleaved pAT153
(Twigg and Sherratt, 1980) and transformed into 
Escherichia coli HB101. Ampicillin-sensitive colonies 
were selected for further characterisation. 

(b) Characterisation of cDNA clones 

Viral clones were identified by hybridisation with 
a probe prepared by polynucleotide kinase labelling 
of alkali-treated, full-length IBV genomic RNA. 
Restriction sites were mapped on a series of clones 
and this enabled construction of a continuous map, 
3.3 kb in length. That these included the poly(A)
sequences at the 3'-terminus of the viral genome was
confirmed by hybridisation with a kinase-labelled 
poly(U) probe. 

(c) Formaldehyde-agarose gel analysis of IBV 
mRNAs 

1.5% formaldehyde-agarose gels were run essentially
as described by Maniatis et al. (1982). Total
RNA samples from IBV-infected chick kidney cell 
cultures were run overnight at 60 V on 16 cm vertical 
gels. IBV mRNAs were detected by blotting onto 
nitrocellulose and probing with nick-translated cloned
IBV sequences (Maniatis et al., 1982). M r s were
calculated by comparison with the mobilities of 
DNA restriction fragments and E. coli and chicken 
ribosomal RNAs. 
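
The size estimate described above amounts to calibrating migration distance against markers of known length. The following is a minimal sketch in Python, assuming the common approximation that log10(size) falls linearly with distance migrated. The marker mobilities and sizes are invented for illustration and are not measurements from this work; the function names are hypothetical.

```python
import math


def fit_log_size(markers):
    """Least-squares fit of log10(size_kb) = a + b * mobility_cm.

    markers: list of (mobility_cm, size_kb) pairs for species of known size.
    """
    xs = [mobility for mobility, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_size(mobility_cm, intercept, slope):
    """Size (kb) predicted by the fitted line for a band at mobility_cm."""
    return 10 ** (intercept + slope * mobility_cm)


# Hypothetical calibration markers (cm migrated, size in kb); NOT data from the paper.
markers = [(2.0, 8.0), (4.0, 4.0), (6.0, 2.0), (8.0, 1.0)]
a, b = fit_log_size(markers)
print(round(estimate_size(5.0, a, b), 2))
```

A log-linear fit of this kind is only reliable within the range spanned by the markers, which is one reason the sizes quoted for the IBV mRNAs are approximate.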

(d) DNA sequence determination 

Plasmid DNA was prepared by a modification of 
the method of Holmes and Quigley (1981). DNA
restriction fragments, 3' end-labelled with
[α-32P]dNTPs using Klenow polymerase or 5' end-labelled
with [γ-32P]ATP using T4 polynucleotide
kinase, were sequenced essentially as described by
Maxam and Gilbert (1980). The depurination
reaction was carried out in 66% formic acid for
10 min at 20°C, after which the samples were treated
in the same way as the pyrimidine reaction. For 
sequencing some regions of the DNA, restriction 
digests of the viral insert were recloned into the 
plasmid pUC9 allowing sequencing from adjacent 
vector restriction sites (Messing and Vieira, 1982).
Sequence data were stored and analysed on an
Apple IIe microcomputer using the programs of Larson
and Messing (1983) and on a VAX 11/780 minicomputer
using the programs of Staden (1984).


RESULTS 

770 bp of DNA sequence from one IBV genomic 
clone, C5.136 (Brown and Boursnell, 1984) have 
been determined. This sequence corresponds to the 
genomic RNA sequence stretching from 2.40 kb to 
1.63 kb from the 3' end of the viral genome. In 
Fig. 2a the arrows show the direction and extent of 
DNA sequence information obtained from individual 
restriction enzyme cleavage sites. Fig. 2b shows 
positions of restriction sites used in the sequencing. 


95% of the sequence has been determined on both 
strands, and on each strand most regions have been 
sequenced more than once from different restriction 
sites. 

Fig. 2c shows the positions of the initiation and 
termination codons in the three reading frames and 
the positions of the 5' ends of mRNAs A and B as 
determined by S1 nuclease mapping (Brown and
Boursnell, 1984). It should be noted that S1 nuclease
mapping will determine the 5' end of the “body” of
the mRNA (see DISCUSSION). The DNA sequence
of 770 nucleotides, with a translation of the three 
main ORFs, is shown in Fig. 3. The lines below the 
sequence at positions 131-165 and 440-466 indicate 

15 30 45 60 75

TACCTTTCAAGTAGATAATGGAAAAGTCTACTACGAAGGAACACCAGTTTTCCAAAAAGGTTGTTGTAGGATGTG 

B 1 

90 105 120 135 * 150

GT CCAATT AT AAGAAAGAAT AATT GAACCACCTACTACACTT ATtTTT AT AAGAG GTG TTT T A CTT AACAAAA AC 


165 180 195 210 225

TTAACAAAT ACGGACGATGAAATGGCTGACTAGTTTTGGAAGAGEAGTTATTTCTTGTTATAAATCCCTACTATT

MKWLTSFGRAVISCYKSLLL


240 255 270 285 300 

AACTCAACTTAGAGtGTTAGATAGGTTAAfTTTAGATCACGGACTACTACGCGTTTTAA£GTGTAGTAGGCGCGt 

TQLRVLDRLILDHGLLRVLTCSRRV

[Fig. 2 graphic not reproduced. Panels (a)-(c) are drawn against a scale of 0-800 bases. Restriction sites marked in (b): AccI, MluI, TaqI, RsaI, PvuII, AluI, AluI. Labels in (c): Nucleocapsid, mRNA B, mRNA A.]


Fig. 2. Sequencing strategy for cloned IBV cDNA (clone 
C5.136). (a) Arrows show the direction and extent of sequence 
information obtained from individual restriction sites. Arrows 
starting with solid circles indicate sequencing of DNA 3' end- 
labelled with Klenow polymerase. Arrows starting with open 
circles indicate sequencing of DNA 5' end-labelled with polynucleotide
kinase. (b) A map of restriction sites used in the sequencing.
(c) The locations of termination codons (vertical bars) and
potential initiation codons (bars with open circles on top) in the 
three possible translational reading frames. The heavy black 
lines show the main ORFs: M r 7500, M r 9500, and the putative 
nucleocapsid ORF. Also shown are the positions of the 5' ends 
of mRNAs A and B as determined by S1 nuclease mapping (see
Fig. 3). 


315 330 345 360 375 
GCTTTTAGTTCAATtAGATTTAGTTTATAGGTTGGCGTATACGCCCACCCAATCGCTGGtATGAATAATAGTAAA 
LLVQLDLVYRLAYTPTQSLA* 

M N N S K 


390 405 420 435 * 450

GATAATCCTTTTCGCGGAGCAATAGCAAGAAAAGCTCGAATTTAtCTGAGAGAAGGATTAGATT GTGTTT ACTTt 

DNPFRGAIARKARIYLREGLDCVYF 


465 480 495 510 525

CTTAACAAAGCAGGACAAGCAGAGCCTTGtCCCGCGTGTACCTCtCTAGTATTCCAAGGGAAAACTTGTGAGGAA 

LNKAGQAEPCPACTSLVFQGKTCEE 


540 555 570 585 600 

cacatacataataataatcttttgtcatggcaagcggtaaagcaGctggaaaaacagacgccccagcgccagtcA 

MASGKAAGKTDAPAPVI

HIHNNNLISWQAVKQLEKQTPQRQS 


615 630 645 660 675 

ttaaactaggaggaccaaaaccacctaaaOtcggttcttctggaaatgcatcttggtttcaagcaataaaagccA 

KLGGPKPPKVGSSGNASWFQAIKAK 
L N * 
agaagttaaatacacctccgcccaagtttGaaggtagcggtgttcctgataacgaaaacattaagccaagccagc
KLNTPPPKFEGSGVPDNENIKPSQQ
aacatggatactggAgacgcc

Fig. 3. 770-bp sequence of part of IBV cDNA clone C5.136. A
translation of the three main ORFs is shown in single-letter 
amino acid code. Termination codons for these ORFs are shown 
as asterisks. Arrows above the sequence show the 5' ends of 
mRNAs A and B as determined by S1 nuclease mapping. Lines
below the sequence at these points indicate the regions of 
homology which occur at the 5' ends of mRNAs A and B. 

the regions of homology which occur at the 5' end
of mRNAs A and B and the arrows above the 
sequence at these points show the 5' ends of the 
bodies of the mRNAs as determined by S1 nuclease 
mapping (Brown and Boursnell, 1984). 

In the genomic sequence immediately 5'-wards of 
the end of mRNA B, there are no ORFs longer than 
81 bases. About 30 bases into mRNA B, at position 
167, there is an AUG codon, followed by an ORF 
of 195 bases. This predicts a protein of M r 7500. The 
coding sequences for this putative protein are entirely 
contained in that part of mRNA B which is not 
present in mRNA A. One base before the termination
codon for this M r 7500 ORF, at position 361,
there is an initiation codon followed by an open 
reading frame which could code for a protein of M r 
9500. This ORF is not contained within the “unique” 
sequences of mRNA B and overlaps considerably 
with mRNA A. In mRNA A, 100 bases from the 5' 
terminus, at position 552, is the beginning of an ORF 
which extends beyond the limits of the sequenced 
region. This is likely to be the start of the gene coding 
for the nucleocapsid protein. A comparison of this 
partial sequence to the sequences published for the 
nucleocapsid proteins of MHV-A59 and MHV- 
JHM7 (Armstrong et al., 1983; Skinner and Siddell, 
1983) reveals no significant homology at the RNA or 
protein level. Although some homology might be 
expected it should be noted that no serological cross-reaction
between IBV and any of the mammalian
coronaviruses has been reported (Siddell et al.,
1983). 
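
The reading-frame survey described above can be illustrated with a short scan for AUG-initiated ORFs in the three forward frames. This is a minimal sketch, not the procedure actually used for Fig. 2c; the function name `find_orfs` and the toy input are hypothetical, and the toy sequence is not taken from the IBV sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """List ATG-initiated ORFs in the three forward reading frames.

    Returns (start, end, frame) tuples with 1-based inclusive coordinates.
    'end' is the last base before the stop codon. ORFs that run off the
    end of the sequence without a stop codon are not reported.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None  # 0-based index of the first ATG of the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i, frame + 1))
                start = None
    return orfs


# Toy sequence for illustration only: ATG AAA TGG TAG lies in frame 3.
print(find_orfs("CCATGAAATGGTAGCC"))
```

Because ORFs lacking a stop codon are skipped, a scan of this kind would not report a gene that extends beyond the sequenced region, as the putative nucleocapsid ORF does here; such cases need to be handled separately.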


DISCUSSION 

The region of the IBV sequence presented in this 
paper contains the 5' ends, on the viral genome, of 
mRNAs A and B. The messenger RNAs of coronavirus
IBV are probably transcribed from non-contiguous
regions of the viral genome. Work on the
murine coronavirus MHV has shown that a common
5' leader sequence, originating from the extreme 5'
end of the viral genome, is fused to the body of each
messenger RNA (Lai et al., 1982; 1983; Spaan et al.,
1983; Baric et al., 1983). It is likely that the IBV
messenger RNAs have the same structure. However,
whether or not leader sequences are present, S1
nuclease mapping experiments have shown that the 


ends of the bodies of mRNAs B and A lie at positions
139 and 443 respectively (Brown and Boursnell,
1984) as shown in Fig. 3. 
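
The positional relationships quoted in RESULTS can be checked with simple arithmetic. The sketch below takes the body of mRNA B to begin at position 139 and that of mRNA A at position 443, which is the assignment implied by the ORF positions given in RESULTS; all variable names are illustrative.

```python
# Coordinates (1-based positions in the 770-base sequence) quoted in the text.
MRNA_B_BODY_START = 139     # 5' end of the body of mRNA B (S1 mapping)
MRNA_A_BODY_START = 443     # 5' end of the body of mRNA A (S1 mapping)
ORF_7500_START = 167        # AUG of the M r 7500 ORF
ORF_7500_LENGTH = 195       # coding bases, excluding the termination codon
ORF_9500_START = 361        # AUG of the M r 9500 ORF
NUCLEOCAPSID_START = 552    # AUG of the putative nucleocapsid ORF

# Last coding base and codon count of the M r 7500 ORF.
orf_7500_end = ORF_7500_START + ORF_7500_LENGTH - 1
orf_7500_codons = ORF_7500_LENGTH // 3

# Sequences present in mRNA B but absent from mRNA A.
unique_to_b = range(MRNA_B_BODY_START, MRNA_A_BODY_START)
orf_7500_in_unique_region = (
    ORF_7500_START in unique_to_b and orf_7500_end in unique_to_b
)

# Distance from the start of each mRNA body to its first ORF.
offset_in_b = ORF_7500_START - MRNA_B_BODY_START
offset_in_a = NUCLEOCAPSID_START - MRNA_A_BODY_START

print(orf_7500_end, orf_7500_codons, orf_7500_in_unique_region)
print(offset_in_b, offset_in_a)
```

On these figures the M r 7500 ORF ends at position 361, where the M r 9500 ORF begins, and it lies wholly within the sequences unique to mRNA B; the offsets of 28 and 109 bases correspond to the "about 30" and "100" bases quoted in RESULTS.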

Of the two ORFs which are present at the 5' end 
of mRNA B, the M r 7500 ORF seems to be the most 
likely candidate for translation in vivo. The location 
of the coding sequence for the putative M r 7500 
polypeptide fits in well with the hypothesis that the 
major polypeptide product of each mRNA is translated
from those 5' sequences not present in the next
smallest RNA. If the M r 9500 polypeptide were
translated then this would no longer be true, since its
coding region stretches well into mRNA A. Furthermore,
the RNA sequences flanking the initiation
codons for the putative M r 7500 and nucleocapsid 
genes correspond well to those preferred for 
functional eukaryotic initiation codons (Kozak, 
1983). However, the sequence GNNAUGA around 
the AUG codon at the start of the M r 9500 ORF is 
rare in this context. In addition the AUG codon at 
the start of the M r 7500 ORF is the first initiation 
codon to occur in the body of mRNA B; since initiation
of translation at anything other than the first
AUG codon is known to be rare (Kozak, 1983) this 
suggests that this ORF codes for the major product 
of mRNA B. 

The amino acid sequence of the putative M r 7500 
polypeptide shows it to be hydrophobic in nature and 
to have an unusual composition in that 26% (17 out
of 65) of its residues are leucine. Of the six possible 
triplets coding for leucine, one (UUA) is used 8 times 
out of 17. This unusual composition and codon bias, 
which would not be expected from a chance ORF, 
suggests that this polypeptide is translated in vivo. A 
computer analysis of the sequences presented here 
has been carried out using the program ANALYSEQ
(Staden, 1984). This program uses certain criteria to 
select one of the three reading frames as being the 
most likely protein coding frame. Although better 
suited to analysing large ORFs, it is interesting to 
note that a search based on looking for codon biases 
above those expected from the base composition 
selects 85% of the M r 7500 ORF as the most likely
coding frame. The 15% of codons not selected are 
not in a single block, which might have suggested a 
sequencing error leading to an artificial frameshift. 
All of the nucleocapsid ORF which has so far been sequenced
was selected as the most likely coding
frame, but none of the M r 9500 ORF. 
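
The composition figures quoted above can be reproduced from the translation shown in Fig. 3. The sketch below is an illustration only: the peptide string is transcribed from the single-letter translation in Fig. 3, and the average residue masses are standard textbook values rather than figures taken from this paper.

```python
from collections import Counter

# Putative M r 7500 polypeptide, transcribed from the translation in Fig. 3.
PEPTIDE = (
    "MKWLTSFGRAVISCYKSLLL"
    "TQLRVLDRLILDHGLLRVLTCSRRV"
    "LLVQLDLVYRLAYTPTQSLA"
)

# Average residue masses in daltons (standard values; an assumption here).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.02  # one water per chain for the free N- and C-termini

counts = Counter(PEPTIDE)
leucine_percent = 100 * counts["L"] / len(PEPTIDE)
molecular_weight = sum(RESIDUE_MASS[aa] for aa in PEPTIDE) + WATER_MASS

print(len(PEPTIDE), counts["L"], round(leucine_percent))
print(round(molecular_weight))
```

The residue count and leucine content agree with the 65 residues and 17 leucines given in the text, and the summed mass comes out close to the M r of 7500 predicted for this ORF.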

We have carried out in vitro translation of total
and poly(A)+ RNA populations from IBV-infected
chick kidney cell cultures using a rabbit reticulocyte 
lysate system. However, analysis of the products on 
10-18% polyacrylamide gradient gels containing 
urea could not resolve any small polypeptides due to 
high background in the relevant low-M r range. A 
similar problem has been found by Stern and Sefton 
(1984; Stern, D.F., personal communication) who 
have carried out in vitro translation studies of gel- 
purified and gradient-fractionated mRNAs and have 
identified no major specific product from mRNAs B 
or D. 

At present, therefore, it is not possible to say 
definitively whether either of these polypeptides is 
produced. In a search for small structural polypeptides,
a [3H]leucine-labelled preparation of virus has
been analysed on a 12.5% polyacrylamide tube gel 
(Cavanagh, D., personal communication), using the 
phosphate buffer system of Swank and Munkres 
(1971). This showed that there were three detectable 
polypeptides of apparent M r s: 16000, 12000, and
10000. The percentage of the total counts in the gel
accounted for by these polypeptides was < 1%.
Thus, if either of these ORFs codes for a structural 
polypeptide, it must only be present in very small 
quantities. It is also possible that they could code for 
non-structural polypeptides. These would be difficult 
to identify in IBV-infected cells by pulse-labelling 
techniques without using immunoprecipitation to 
lower the background of host-cell incorporation 
which is poorly shut off by IBV infection. The availability
of sequence data for these putative polypeptides,
however, opens up the possibility of using
immunoprecipitation with antisera prepared against 
synthetic oligopeptides to search for the presence of 
these polypeptides in IBV-infected cells. 